Method and apparatus for generating a summary from a document image

ABSTRACT

A summary of a captured document image is produced on the basis of handwritten annotations made to the document prior to image capture. The scanned (or otherwise captured) image is processed to detect these annotations, which can then be used to identify features, or text, for use in summarizing that document. Additionally, or alternatively, the annotations detected in one document can be used to identify features, or text, for use in summarizing a different document. The summary may be displayed in expandable detail levels.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] Cross-reference is made to U.S. patent application Ser. No. 09/AAA,AAA, entitled “Method And Apparatus For Processing Documents” (Attorney Docket No. D/99632), which is hereby incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] This invention relates to processing a scanned image of a document (for example, a paper document) to generate a document summary from the scanned image.

[0004] 2. Description of Related Art

[0005] There are many occasions on which it would be desirable to compile a summary of a document automatically. Several approaches for such systems have been proposed in the prior art.

[0006] For example, European Patent Application EP 0902379 A2 describes a technique in which a user is able to mark certain words or phrases in an electronic version of a document (for example, ASCII text), which the system then extracts to compile a document summary. However, such a system requires the user to work with an electronic version of the document. Furthermore, the document must already exist in electronic form before any words or phrases can be selected by the user.

[0007] Regarding the summarizing of paper documents (or scanned images of paper documents), reference may be made to the following documents:

[0008] U.S. Pat. Nos. 5,638,543 and 5,689,716 describe systems in which paper document images are scanned and the images are processed using optical character recognition (OCR) to produce a machine-readable version of the document. A summary is generated by allocating “scores” to sentences depending on critical or thematic words detected in the sentence. The summary is generated from the sentences having the best scores.

[0009] U.S. Pat. No. 5,848,191 describes a system similar to U.S. Pat. No. 5,689,716 using scores to rank sentences, the score being dependent on the number of thematic words occurring in a sentence. However, in U.S. Pat. No. 5,848,191, the summary is generated directly from the scanned image without performing OCR.

[0010] U.S. Pat. No. 5,491,760 describes a system in which significant words, phrases and graphics in a document image are recognized using automatic or interactive morphological image recognition techniques. A document summary or an index can be produced based on the identified significant portions of the document image.

[0011] “Summarization Of Imaged Documents Without OCR” by Chen and Bloomberg, in Computer Vision and Image Understanding, Vol. 70, No. 3, June 1998, pages 307-320, describes an elaborate technique based on feature extraction and scoring sentences based on the values of a set of discrete features. Prior information is used in the form of feature vector values obtained from summaries compiled by professional human summary compilers. The sentences to be included in the summary are chosen according to the score of the sentence.

[0012] The above paper-based techniques all employ variations of statistical scoring to decide (either on the basis of OCR text or on the basis of image maps) which features, or sentences, should be extracted for use in the compiled summary.

SUMMARY OF THE INVENTION

[0013] In contrast to the above techniques, one aspect of the present invention is to generate a summary of a captured (e.g., scanned) image of a document on the basis of detected handwritten or electronic annotations made to the document prior to image capture.

[0014] In more detail, the captured image is processed to detect annotations made to the document prior to image capture. The detected annotations can be used to identify features, or text, for use in summarizing that document. Additionally or alternatively, the detected annotations in one document can be used to identify features, or text, for use in summarizing a different document.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] These and other aspects of the invention will become apparent from the following description read in conjunction with the accompanying drawings, wherein the same reference numerals have been applied to like parts and in which:

[0016] FIG. 1 is a schematic block diagram of a first embodiment for processing a paper document to generate a summary of the document;

[0017] FIG. 2 is a schematic flow diagram showing the process for generating the summary;

[0018] FIG. 3 is a schematic view of an annotated page of a document;

[0019] FIG. 4 is an enlarged schematic view of a portion of FIG. 3 illustrating extraction of a sentence; and

[0020] FIG. 5 is a schematic diagram illustrating options for displaying the summary.

DETAILED DESCRIPTION

[0021] Referring to FIG. 1, a system 10 is illustrated for generating a summary from a paper document 12. The system comprises an optical capture device 14 for capturing a digital image (for example, a bitmap image) of each page of the paper document 12. The capture device 14 may be in the form of a digital camera or a document scanner.

[0022] The system 10 also includes a processor 16 for processing the captured digital image to generate a summary therefrom. The processor is coupled to one or more operator input devices 18 (for example, a keyboard or a pointing device) and also to one or more output devices 20 for outputting the generated summary. The output devices 20 may, for example, include a display unit and/or a printer.

[0023] In contrast to the prior art, one of the principles of this embodiment is to generate the summary on the basis of annotations made by hand to the paper document prior to scanning (or capture) by the optical capture device 14. The processor 16 processes the digital image to detect hand annotations indicating areas of interest in the paper document. Text or other features indicated by the annotations are extracted and used to compile the summary. The summary therefore reflects the areas of interest identified by the hand annotations in the paper document.
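
By way of illustration only, the overall flow of FIG. 2 may be sketched as follows. Every name and function body below is a hypothetical placeholder, not an implementation given by the disclosure.

```python
# Illustrative sketch of the FIG. 2 flow; all helpers are hypothetical
# placeholders rather than implementations named by the disclosure.

from typing import List


def detect_annotations(image) -> List[dict]:
    """Step 30: locate hand annotations in the captured image."""
    return []  # placeholder


def interpret_annotations(annotations: List[dict]) -> List[dict]:
    """Step 46: map each annotation to an include/exclude/move action."""
    return [dict(a, action="include") for a in annotations]


def extract_features(image, actions: List[dict]) -> List[dict]:
    """Step 48: crop the image regions that the actions identify."""
    return [{"image_map": None, "source": a.get("bbox")} for a in actions]


def compile_summary(features: List[dict]) -> List[dict]:
    """Step 52: assemble the extracted features into a summary."""
    return features


def summarize_scanned_document(image) -> List[dict]:
    annotations = detect_annotations(image)
    actions = interpret_annotations(annotations)
    features = extract_features(image, actions)
    return compile_summary(features)
```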

[0024] Referring to FIG. 2, the process for creating the summary by the processor 16 comprises a first step 30 of identifying, in the captured digital image, the annotations made by the user. Suitable techniques for identifying annotations are described, for example, in U.S. Pat. Nos. 5,570,435, 5,748,805 and 5,384,863, the contents of which are incorporated herein by reference. These patents disclose techniques for distinguishing regular machine printing from handwritten marks and annotations.
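
The methods of those patents are not reproduced here. As a greatly simplified, hypothetical stand-in, a detector might exploit the fact that machine print tends toward a uniform character height, flagging connected components whose height deviates strongly from the document's typical text height:

```python
# Simplified stand-in for a print/handwriting separator; this is not
# the method of the patents incorporated by reference above.

import numpy as np
from scipy import ndimage


def candidate_annotations(binary_image: np.ndarray, tol: float = 0.5):
    """Return bounding slices of connected components whose height
    deviates from the typical text height by more than `tol` (as a
    fraction of that height); such components are annotation candidates."""
    labeled, count = ndimage.label(binary_image)
    boxes = ndimage.find_objects(labeled)
    heights = np.array([b[0].stop - b[0].start for b in boxes])
    if heights.size == 0:
        return []
    typical = np.median(heights)  # robust proxy for the modal height
    return [b for b, h in zip(boxes, heights)
            if abs(h - typical) > tol * typical]
```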

[0025] FIG. 3 illustrates the kinds of hand annotations which can typically be identified, which include underlining 32, circling 34, bracketing 36, margin bracketing or marking 38, cross-through 40, anchored arrows indicating place changes 42, and handwritten notes or insertions 44.
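
These annotation classes might, for instance, be represented by a simple enumeration; the encoding below is hypothetical, keyed to the FIG. 3 reference numerals:

```python
from enum import Enum, auto


class AnnotationType(Enum):
    """Hand-annotation classes of FIG. 3 (reference numerals in comments)."""
    UNDERLINE = auto()           # 32
    CIRCLE = auto()              # 34
    BRACKET = auto()             # 36
    MARGIN_MARK = auto()         # 38
    CROSS_THROUGH = auto()       # 40
    PLACE_CHANGE_ARROW = auto()  # 42
    HANDWRITTEN_NOTE = auto()    # 44
```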

[0026] At step 46 (FIG. 2), interpretation of the annotations is carried out. The level of interpretation may vary from one embodiment to another, depending on the complexity of annotation permitted by the system 10. For example, simple word underlining 32 or circling 34 does not need interpretation, as the words are identified directly by the annotations. Bracketing 36 and margin marking 38 require only simple interpretation: identifying the entire text spanned by the brackets or marking.

[0027] Cross-through annotations 40 are preferably interpreted as a negative annotation, for excluding the crossed-through text from the summary. This may be regarded in one respect as being equivalent to no annotation at all (and hence not drawing any focus to the text for inclusion in the summary). However, a cross-through annotation 40 also provides a way of excluding one or more words near a highlighted word from being included as part of the contextual text (FIG. 4).

[0028] Place change arrows 42 and handwritten notes or insertions 44 also require interpretation to identify the respective positions identified by the annotations.
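
One possible, hypothetical encoding of the interpretation rules of paragraphs [0026]-[0028] is a lookup table from annotation class to extraction action:

```python
# Hypothetical encoding of step 46; the action names are illustrative.

INTERPRETATION_RULES = {
    "underline":     "include_marked_words",  # no interpretation needed
    "circle":        "include_marked_words",
    "bracket":       "include_spanned_text",  # whole spanned region
    "margin_mark":   "include_spanned_text",
    "cross_through": "exclude_text",          # negative annotation
    "place_arrow":   "reposition_text",       # resolve source and target
    "note":          "insert_at_anchor",      # resolve insertion point
}


def interpret(annotation_type: str) -> str:
    """Map a detected annotation class to its extraction action."""
    return INTERPRETATION_RULES.get(annotation_type, "ignore")
```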

[0029] At step 48 (FIG. 2), regions of the digital image identified by the interpreted annotations are extracted for use in the summary. Each region is referred to herein as a “feature”, and is an image map of the extracted region from the digital image. In addition, each feature is tagged with a pointer or address indicating the place in the originally scanned image from which it is extracted (or copied).
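
Such a tagged feature might be carried through the pipeline as a small record; the field names below are hypothetical:

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class Feature:
    """An extracted "feature" per paragraph [0029]: an image map of a
    region plus a pointer back into the originally captured image."""
    image_map: Any                           # cropped bitmap of the region
    source_page: int                         # page in the captured document
    source_bbox: Tuple[int, int, int, int]   # (left, top, right, bottom)
    annotation_type: str                     # e.g. "underline", "circle"
```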

[0030] If an annotation identifies only a single word, or a short phrase, then the extracted feature for that annotation is preferably expanded to include additional contextual information or text for the annotation. Normally, the feature will be expanded to include the sentence 50 (FIG. 4) around the annotation. Therefore, at step 48, the processor 16 identifies the location of full stops and other machine-printed marks or boundaries indicating the start and finish of a sentence.
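
As an illustration of this expansion, operating for simplicity over recognized text rather than the image map itself (an assumption made only for this sketch), the enclosing sentence can be found by scanning outward from the annotated span to the nearest sentence-ending marks:

```python
SENTENCE_ENDS = ".!?"


def expand_to_sentence(text: str, start: int, end: int) -> str:
    """Return the sentence of `text` containing the span [start:end)."""
    left = start
    while left > 0 and text[left - 1] not in SENTENCE_ENDS:
        left -= 1
    right = end
    while right < len(text) and text[right] not in SENTENCE_ENDS:
        right += 1
    if right < len(text):
        right += 1  # keep the terminating punctuation mark
    return text[left:right].strip()


# Example: expanding the annotated word "delta" to its sentence.
text = "Alpha beta. Gamma delta epsilon. Zeta."
assert expand_to_sentence(text, 18, 23) == "Gamma delta epsilon."
```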

[0031] Although FIGS. 3 and 4 only illustrate annotation of text in a document, one or more graphic portions of the document may also be annotated to be included in the summary. In such a case, at step 48, an image map corresponding to the annotated graphic “feature” is extracted.

[0032] At step 52, the summary is compiled from the extracted features. The summary may be compiled in the form of image maps of the extracted features, or text portions of the features may be OCR processed to generate character codes for the text. Similarly, handwritten notes or insertions may be OCR processed to generate character codes, or they may be used as image maps.
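
A minimal sketch of this compilation step follows, with OCR as an optional path. The feature field names are hypothetical, and pytesseract is named only as an example OCR engine, not one specified by the disclosure:

```python
# Sketch of step 52: features become image maps or, optionally,
# OCR-recognized text; field names are hypothetical.

from typing import List


def compile_summary(features: List[dict], use_ocr: bool = False) -> List:
    summary = []
    for feature in features:
        if feature.get("action") == "exclude_text":
            continue  # crossed-through text is dropped (paragraph [0033])
        if use_ocr:
            import pytesseract  # example OCR engine; assumed installed
            summary.append(pytesseract.image_to_string(feature["image_map"]))
        else:
            summary.append(feature["image_map"])
    return summary
```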

[0033] During compilation, any further interpretation of the annotations which may be required can be carried out. For example, any crossed-through text can be deleted (removed) from the summary (for example, the crossed-through text 40 in FIG. 4).

[0034] Additionally, during compilation, identically annotated features may be itemized, for example, with bullets. For example, sentences containing circled words may be organized together as a bulleted list. Such an operation is preferably a user-controllable option, but this can provide a powerful technique enabling a user to group items of information together in the summary simply by using the same annotation for marking the information in the original document.

[0035] Additionally, during compilation, parts of the summary may be highlighted as important, based on the annotations made by hand. For example, annotations such as an exclamation mark (54 in FIG. 3) or double underlining may be included in the summary as importance marking, for example, by bold or underlined text, or text in a different color.
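
The grouping and importance options of paragraphs [0034] and [0035] might be sketched as follows; annotation-class names and feature fields are hypothetical, and bold is rendered here with simple markers:

```python
# Hypothetical sketch of grouping and importance marking at compilation.

from collections import defaultdict
from typing import List


def render_summary(features: List[dict]) -> List[str]:
    """Group features by annotation class; circled-word features form a
    bulleted list, and importance-marked features are rendered bold."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature["annotation_type"]].append(feature["text"])
    lines = []
    for ann_type, texts in sorted(groups.items()):
        bullet = "* " if ann_type == "circle" else ""
        for text in texts:
            if ann_type in ("exclamation", "double_underline"):
                text = f"**{text}**"  # importance marking (bold on output)
            lines.append(bullet + text)
    return lines
```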

[0036] At step 56, the compiled summary is outputted, for example, on the user's display or printer.

[0037] In this embodiment, the system 10 provides a plurality of layered detail levels in a window 57 for the summary, indicated in FIG. 5. These layers may be applied either during compilation, or during outputting of the summary information.

[0038] The lowest detail level 58 merely includes any subject headings extracted from the document.

[0039] By clicking on any subject heading, the subject heading is expanded to its second detail level 60 to generate the text summary of the appropriate section of the document. The second detail level 60 only includes text features. However, by clicking again, the summary is expanded (third detail level 62) to include non-text features as part of the summary, such as annotated figures from that section of the document.

[0040] By clicking on any sentence, the summary is expanded (fourth detail level 64) to display further context for the sentence, for example, by displaying the paragraph containing the sentence.

[0041] In a final layer (fifth detail level 66), the annotation associated with any sentence in the document may be “retrieved” by clicking on the sentence.
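
The five levels described above might be represented, purely as a sketch with hypothetical field names, as cumulative layers of a per-heading record:

```python
# Hypothetical record for the five detail levels of FIG. 5.

from dataclasses import dataclass, field
from typing import List


@dataclass
class SummaryNode:
    heading: str                                                  # level 58
    text_features: List[str] = field(default_factory=list)       # level 60
    nontext_features: List[str] = field(default_factory=list)    # level 62
    context_paragraphs: List[str] = field(default_factory=list)  # level 64
    source_annotations: List[str] = field(default_factory=list)  # level 66


def render(node: SummaryNode, level: int) -> List[str]:
    """Return the lines visible at detail level 1 (headings) to 5."""
    out = [node.heading]
    if level >= 2:
        out += node.text_features
    if level >= 3:
        out += node.nontext_features
    if level >= 4:
        out += node.context_paragraphs
    if level >= 5:
        out += node.source_annotations
    return out
```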

[0042] In an alternate embodiment, the plurality of layered detail levels for the summary may be accessed simply by clicking on each level of detail set forth in the window 57 shown in FIG. 5. That is, window 57 may be used both to indicate the current level of detail being used to summarize a document and to access a particular level of detail.

[0043] In the present embodiment, the summary is based on annotations made to the document to be summarized. However, in other embodiments, the summary may be based on annotations made to a different document, for example, a previously annotated document or a master document. In such an embodiment, a first document is annotated by hand, and the annotations are detected and stored by the system 10. A second document is then captured by the system, and the second document is processed based on the annotations detected from the first document. In other words, the annotations detected in the first document are used as a guide for generation of the summary of the second document (in the same manner as if the hand annotations had been made to the second document). Further information about this technique is described in U.S. patent application Ser. No. AAA,AAA (Attorney Docket No. D/99632), entitled “Method And Apparatus For Forward Annotating Documents”, which is hereby incorporated herein by reference.
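
A minimal sketch of this cross-document embodiment follows; every class and function name is a hypothetical stand-in, and the position-mapping step is deliberately left as a placeholder:

```python
# Hypothetical sketch: annotations from a first (master) document are
# stored and later applied to guide the summary of a second document.

class AnnotationStore:
    """Holds the annotation profile detected on each captured document."""

    def __init__(self):
        self._profiles = {}

    def save(self, doc_id: str, annotations: list) -> None:
        self._profiles[doc_id] = annotations

    def load(self, doc_id: str) -> list:
        return self._profiles.get(doc_id, [])


def summarize_with_master(store: AnnotationStore, master_id: str,
                          second_image) -> list:
    """Summarize `second_image` as if the master's hand annotations had
    been made to it directly."""
    annotations = store.load(master_id)
    # Placeholder: a real system would map each annotation's position
    # onto the second document's layout before extracting features.
    return [{"annotation": a, "target": second_image} for a in annotations]
```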

[0044] The invention has been described with reference to a particular embodiment. Modifications and alterations will occur to others upon reading and understanding this specification taken together with the drawings. The embodiments are but examples, and various alternatives, modifications, variations or improvements may be made by those skilled in the art from this teaching, which are intended to be encompassed by the following claims.

1. An apparatus for generating a summary of a document, comprising: an image capture device for capturing an image of a document; a processing device for detecting annotations made to the document prior to image capture; and a summary generator for generating a summary of a document based on the detected annotations.

2. The apparatus according to claim 1, wherein the summary generator is operative to generate a summary of the same document as that on which the annotations are detected.

3. The apparatus according to claim 1, wherein the summary generator is operative to generate a summary of a different document from that on which the annotations are detected.

4. The apparatus according to claim 3, wherein the image capture device is operative to capture an image of a second document to be summarized based on the detected annotations from the first document.

5. The apparatus according to claim 1, wherein the processing device is operative to identify an image region associated with a detected annotation.

6. The apparatus according to claim 5, wherein the image region represents a sentence in the document image to provide context for the identified annotation.

7. The apparatus according to claim 1, wherein the summary generator is operative to generate a summary comprising portions which are selectively expandable to increase the information in that portion of the summary.

8. A method of generating a summary of a document, comprising: capturing an image of a document; detecting annotations made to the document prior to image capture; and using the detected annotations in the generation of a summary of a document.

9. A method according to claim 8, wherein the document summarized is the same document as that on which the annotations are detected.

10. A method according to claim 8, wherein the document summarized is a different document from that on which the annotations are detected.

11. A method according to claim 10, further comprising capturing an image of a second document to be summarized based on the detected annotations from the first document.

12. A method according to claim 8, wherein said detection comprises identifying an image region associated with a detected annotation.

13. A method according to claim 12, wherein the image region represents a sentence in the document image to provide context for the identified annotation.

14. A method according to claim 8, further comprising generating a summary comprising portions which are selectively expandable to increase the information in that portion of the summary.